WO 00/43507

METHOD FOR PRODUCING ANTIBODY FRAGMENTS 

FIELD OF THE INVENTION 

The present invention relates to an expression library
comprising a repertoire of nucleic acid sequences derived from
the natural sequence repertoire but modified to enhance the
extent of sequence variability, each said nucleic acid sequence
encoding at least part of a variable domain of a heavy chain
derived from an immunoglobulin naturally devoid of light chains,
and to its use in producing antibodies or, more particularly,
fragments thereof.

In particular, the invention relates to a method for the
preparation of antibodies, or fragments thereof, having binding
specificity for a target antigen, which method avoids the need
for the donor to have been immunised previously with the target
antigen.

BACKGROUND TO THE INVENTION 

Monoclonal antibodies, or binding fragments thereof, have
traditionally been prepared using hybridoma technology (Kohler
and Milstein, 1975, Nature, 256, 495). More recently, the
application of recombinant DNA methods to generating and
expressing antibodies has found favour. In particular, interest
has concentrated on combinatorial library techniques with the
aim of utilising the antibody repertoire more efficiently.

The natural immune response in vivo generates antigen-specific
antibodies via an antigen-driven recombination and selection
process wherein the initial gene recombination mechanism
generates low-specificity, low-affinity antibodies. These
clones can be mutated further by antigen-driven hypermutation of
the variable region genes to provide high-specificity,
high-affinity antibodies.

Approaches to mimicking the first stage randomisation process
which have been described in the literature include those based
on the construction of 'naive' combinatorial antibody libraries
prepared by isolating panels of immunoglobulin heavy chain
variable (VH) domains and recombining these with panels of light
chain variable (VL) domains (see, for example, Gram et al,
Proc. Natl. Acad. Sci. USA, 89, 3576-3580, 1992). Naive
libraries of antibody fragments have been constructed, for
example, by cloning the rearranged V-genes from the IgM RNA of B
cells of un-immunised donors isolated from peripheral blood
lymphocytes, bone marrow or spleen cells (see, for example,
Griffiths et al, EMBO Journal, 12(2), 725-734, 1993; Marks et
al, J. Mol. Biol., 222, 581-597, 1991). Such libraries can be
screened for antibodies against a range of different antigens.

In combinatorial libraries derived from a large number of VH
genes and VL genes, the number of possible combinations is such
that the likelihood that some of these newly formed combinations
will exhibit antigen-specific binding activity is reasonably
high, provided that the final library size is sufficiently large.
Given that the original B-cell pairings between antibody heavy
and light chains, selected by the immune system according to
their affinity of binding, are likely to be lost in the
randomly recombined repertoires, low affinity pairings would
generally be expected. In line with expectations, low affinity
antibody fragments (Fabs) with Ka values of 10^4 to 10^5 M^-1
for a progesterone-bovine serum albumin (BSA) conjugate have
been isolated from a small (5 x 10^6) library constructed from
the bone marrow of non-immunised adult mice (Gram et al, see
above).
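
The dependence on library size can be made concrete with a rough calculation. The Python sketch below assumes that each clone is an independent, uniform draw from all possible VH/VL pairings (a Poisson sampling model, which is an illustrative assumption and not part of the cited work). The repertoire sizes of 10^4 VH and 10^4 VL genes are hypothetical; only the library size of 5 x 10^6 is taken from the text.

```python
import math

def possible_pairings(n_vh: int, n_vl: int) -> int:
    """Number of distinct heavy/light combinations in a combinatorial library."""
    return n_vh * n_vl

def expected_coverage(library_size: float, n_pairings: float) -> float:
    """Expected fraction of pairings present at least once, assuming each
    clone is an independent, uniform draw (Poisson approximation)."""
    return 1.0 - math.exp(-library_size / n_pairings)

# Hypothetical repertoire sizes; only the library size comes from the text.
pairings = possible_pairings(10_000, 10_000)
coverage = expected_coverage(5e6, pairings)
print(f"{pairings:.1e} pairings, coverage {coverage:.1%}")
```

Under these assumed numbers only a few percent of the possible pairings would be sampled, which illustrates why library size is treated as a limiting factor.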

Antibody fragments of higher affinity (Ka values in the range
10^6 to 10^7 M^-1) were selected from a repertoire of 3 x 10^7
clones, made from the peripheral blood lymphocytes of two
healthy human volunteers (Marks et al, see above), comprising
heavy chain repertoires of the IgM (naive) class. These were
combined with both lambda and kappa light chain sequences
isolated from the same source. Antibodies to more than 25
antigens were isolated from this library, including
self-antigens (Griffiths et al, see above) and cell-surface
molecules (Marks et al, Bio/Technology, 11, 1145-1149, 1993).

The second stage of the natural immune response, involving
affinity maturation of the selected specificities by mutation
and selection, has been mimicked in vitro using the technique of
random point mutation in the V-genes and selection of mutants
for improved affinity.

Recently, the construction of a repertoire of 1.4 x 10^10 scFv
clones, achieved by 'brute force' cloning of rearranged V genes
of all classes from 43 non-immunised human donors, has been
reported (Vaughan et al, 1996; Griffiths et al, see above).
Antibodies to seven different targets (including toxic and
immunosuppressant molecules) were isolated, with measured
affinities all below 10 nM.

The main limitation in the construction of combinatorial
libraries is their size, which consequently limits their
complexity. Evidence from the literature suggests that there is
a direct link between library size and diversity and antibody
specificity and affinity (see Vaughan et al, Nature
Biotechnology, 14, 309-314, 1996), such that the larger (and
more diverse) the library, the higher the affinity of the
selected antibodies.

The optimisation of binding affinity through random
recombination of a heavy and light chain in combinatorial
libraries is complicated by sequence variations in the two
framework regions (i.e. the parts of the variable domains that
serve as a scaffold in supporting the regions of
hypervariability, which are in turn termed the complementarity
determining regions or CDRs).

Only some combinations of framework sequences are compatible
with the folding and interaction required for the correct
orientation of the 6 CDRs that is necessary for good binding
affinity. Consequently, conventional combinatorial libraries
are likely to contain a high percentage of molecules that are
non-functional.

The affinity of antibodies may also be improved by the process
of "chain shuffling", whereby a single heavy or light chain is
recombined with a library of partner chains (Marks et al,
Bio/Technology, 10, 779-782, 1992).

EP-A-0368684 (Medical Research Council) discloses the
construction of expression libraries comprising a repertoire of
nucleic acid sequences each encoding at least part of an
immunoglobulin variable domain and the screening of the encoded
domains for binding activities. It is stated that repertoires
of genes encoding immunoglobulin variable domains are preferably
prepared from lymphocytes of animals immunised with an antigen.
The isolation of single VH domains having antigen binding
activities, facilitated by immunisation, is exemplified (see
Example 6).

These results illustrate that although the VH part alone of a
classical antibody binding domain can exhibit binding activity,
the specificity and affinity are generally very low. This may
be explained by the absence of the functional involvement of the
missing light chain such that only half of the intended binding
pocket is present, leading to binding with related or homologous
targets.

EP-A-0368684 further describes the cloning of heavy chain
variable domains with binding activities generated by
mutagenesis of one or each of the CDRs. The preparation of a
repertoire of CDR3s is described by using "universal" primers
based in the flanking sequences, and likewise repertoires of
other CDRs singly or in combination. These synthetic mutant VH
clones can then be recombined with VL chains to produce a
synthetic combinatorial library.

Construction of libraries by such synthetic recombinatorial
techniques produces a repertoire of molecules that collectively
exhibit a good degree of binding diversity, wherein the
variability is focussed into the sequences that encode the CDRs
of each chain. However, this technique does not overcome the
problems previously discussed with respect to random
recombination of heavy and light chains and production of
non-functional molecules. Furthermore, there are still six
separate regions (3 CDRs in VH and another 3 in VL) determining
the binding capacity of the molecule; hence the repertoire of
possible binding variants is encoded within a rather diffuse
stretch of coding sequence, thus making a focussed approach to
altering the binding affinity of these binding domains very
difficult.

There remains a continuing need for the development of improved
methods for constructing libraries of immunoglobulin binding
domains. In particular, it would be desirable to avoid the
recombination of heavy and light chains, thereby preventing the
formation of molecules that are non-functional following
recombination.

It would also be an advantage to reduce the number of 
hypervariable residues in the binding domain as this would allow 
a more complete repertoire of possible binding variants to be
obtained. 

WO 94/4678, Casterman et al, describes immunoglobulins capable
of exhibiting the functional properties of conventional
(four-chain) immunoglobulins but which comprise two heavy
polypeptide chains and which furthermore are devoid of light
polypeptide chains. Fragments of such immunoglobulins,
including fragments corresponding to isolated heavy chain
variable domains or to heavy chain variable domain dimers linked
by the hinge disulphide, are also described. Methods for the
preparation of such antibodies or fragments thereof on a large
scale, comprising transforming a mould or yeast with an
expressible DNA sequence encoding the antibody or fragment, are
described in patent application WO 94/25591 (Unilever).

The immunoglobulins described in WO 94/4678, which may be
isolated from the serum of Camelids, do not rely upon the
association of heavy and light chain variable domains for the
formation of the antigen-binding site but instead the heavy
polypeptide chains alone naturally form the complete antigen
binding site. These immunoglobulins, hereinafter referred to as
"heavy-chain immunoglobulins", are thus quite distinct from the
heavy chains derived from conventional (four-chain)
immunoglobulins. Heavy chains from conventional immunoglobulins
contribute only part of the antigen-binding site and require a
light chain partner, forming a complete antigen binding site,
for optimal antigen binding.

As described in WO 94/4678, heavy chain immunoglobulin VH
regions isolated from Camelids (hereinafter VHH domains), which
form a complete antigen binding site and thus constitute a
single domain binding site, differ from the VH regions derived
from conventional four-chain immunoglobulins in a number of
respects, notably in that they have no requirement for special
features for facilitating interaction with corresponding light
chain domains. Thus, whereas in conventional (four-chain)
immunoglobulins the amino acid residue involved in the VH/VL
interaction is highly conserved and generally apolar leucine, in
Camelid derived VH domains this is replaced by a charged amino
acid, generally arginine. It is thought that the presence of
charged amino acids at this position contributes to increasing
the solubility of the camelid derived VH. A further difference
which has been noted is that one of the CDRs of the heavy chain
immunoglobulins of WO 94/4678, the CDR3, may contain an
additional cysteine residue associated with a further additional
cysteine residue elsewhere in the variable domain. It has been
suggested that the establishment of a disulphide bond between
the CDR3 and the remaining regions of the variable domain could
be important in binding antigens and may compensate for the
absence of light chains.

cDNA libraries composed of nucleotide sequences coding for a
heavy-chain immunoglobulin and methods for their preparation are
disclosed in WO 94/4678. It is stated that these
immunoglobulins have undergone extensive maturation in vivo and
the V region has naturally evolved to function in the absence of
the light chain variable domain. It is further suggested that,
in order to allow for the selection of antibodies having
specificity for a target antigen, the animal from which the
cells used to prepare the library are obtained should be
pre-immunised against the target antigen. No examples of the
preparation of antibodies are given in the specification of
WO 94/4678. The need for prior immunisation is also referred to
in Arbabi Ghahroudi et al (FEBS Letters, 414, (1997), 521-526).

Davies et al (Bio/Technology, 13, 475-479, 1995) describe an 
approach to the construction of a library of binding domains 
based on a modified human VH domain which is intended to mimic a 
camelid VHH domain. This method involves replacement of the 
sequence segment encoding one of the CDRs by random, synthetic
sequences. Although it was possible to isolate domains with
selected antigen binding properties from the resulting library, 
these were generally characterised by poor binding affinity and 
specificity for protein antigens. The results would not seem, 
therefore, to recommend the further application of this type of 
approach. 

The present invention relates to an expression library 
comprising a repertoire of synthetic or semi-synthetic nucleic 
acid sequences, not cloned from an immunised source, wherein 
said nucleic acid sequences are derived from immunoglobulins 
that are naturally devoid of light chains. 

SUMMARY OF THE INVENTION 

This invention is based on the unexpected finding that high 
affinity, high specificity antibodies or fragments thereof 
capable of binding either to protein or small molecule antigens 
can be obtained from a non-immunised camelid source provided 
that random mutagenesis of one or more CDRs is carried out, or 
that alternative combinations of existing CDRs are generated, in 
order to increase the extent of sequence variability in the 
antibody repertoire. 

The present invention therefore provides an expression library 
comprising a repertoire of nucleic acid sequences, which 
sequences are not cloned from an immunised source, each nucleic 
acid sequence encoding at least part of a variable domain of a 
heavy chain derived from an immunoglobulin naturally devoid of 
light chains wherein the extent of sequence variability in said 
library is enhanced compared to the corresponding naive 
expression library by introducing mutations in one or more of 
the complementarity determining regions (CDRs) of said nucleic 
acid sequences or by random recombination of fragments of said 
nucleic acid sequences, thereby generating alternative
combinations of CDR and framework sequences not naturally 
present in the naive library repertoire. 

BRIEF DESCRIPTION OF DRAWINGS 

Figure 1 shows a schematic representation of the assembly
strategy for the expression library, wherein:

Step 1 illustrates the isolation of the framework regions.
Steps 2A-C illustrate attaching the variable CDR regions to FR2,
FR3 and FR4 respectively.
Step 3 involves linking the FR1 to the CDR1-FR2 encoding
fragments.
Step 4 involves linking the FR1-CDR1-FR2 to the CDR2-FR3
fragments.
Steps 5A-D depict the final 'full length' VHH fragment
assemblies.

Figure 2 illustrates variability designed into the CDR regions
of a synthetic library (where variability is designed
into all three CDRs) and a semi-synthetic library (where
only the CDR3 varies).

DEFINITION OF TERMS 

The term "naive library" refers to a collection of nucleic acid
sequences encoding the naturally occurring VHH repertoire from a
non-immunised source (see Example 1).

The term "synthetic library" refers to a collection of nucleic
acid sequences, herein referred to as synthetic nucleic acid
sequences, encoding single heavy chain antibodies or fragments
thereof in which all CDR regions have undergone some form of
rearrangement.



PCT/EP00/00296



The term "semi-synthetic library" refers to a collection of
nucleic acid sequences encoding single heavy chain antibodies or
fragments thereof in which at least one CDR region retains
natural variability and at least one CDR region has undergone
some form of controlled rearrangement. Preferably, in the
semi-synthetic library the CDR to be randomised or mutagenised
is the CDR3.

As used herein, the term "antibody" refers to an immunoglobulin 
which may be derived from natural sources or synthetically 
produced, in whole or in part. The terms "antibody" and 
"immunoglobulin" are used synonymously throughout the 
specification unless indicated otherwise. 

An "antibody fragment" is a portion of a whole antibody which 
retains the ability to exhibit antigen binding activity. 

The term "VHH" refers to the single heavy chain variable domain 
antibodies of the type that can be found in Camelid mammals 
which are naturally devoid of light chains; synthetic and naive 
VHH can be construed accordingly. 

The term "CDR" refers to the complementarity determining region
of the antibody structure.

The term "library" refers to a collection of nucleic acid
sequences.

The term "repertoire," again meaning a collection, is used to 
indicate genetic diversity. 

The term "framework region" is used herein to refer to the 
nucleic acid sequence regions of an antibody molecule that 
encode the structural elements of the molecule. 
The term "anchor regions" refers to nucleic acid sequences that
show homology to part of the nucleic acid sequence of the
framework region or class of framework regions, such that
primers based on these anchor region sequences can be used to
amplify the framework regions that are present in the naive
library. In the present invention the primers based on the
anchor regions were able to identify at least 10^6 different
framework region sequences which could be divided into 5
different classes of fragments.

DETAILED DESCRIPTION OF THE INVENTION 

The invention is based on the unexpected finding that the 
development of an expression library consisting of VHH domains 
derived from an immunoglobulin naturally devoid of light chains 
in which one or more of the three CDRs have been modified to 
enhance the extent of their sequence variability provides an 
effective and superior source of high affinity and high 
specificity antibodies, or fragments thereof when compared to 
conventional dual chain antibody expression libraries. 

The heavy chain variable domains (VHH) for use according to the 
invention may be derived from any immunoglobulin naturally 
devoid of light chains, such that the antigen-binding capacity 
and specificity is located exclusively in the heavy chain 
variable domain. 

Preferably, the heavy chain variable domains may be obtained
from camelids (as described in WO 94/4678, above), especially
Lamas (for example Lama glama, Lama vicugna or Lama pacos) or
from Camelus (for example Camelus dromedarius or Camelus
bactrianus). Suitable sources include lymphoid cells,
especially peripheral blood lymphocytes, bone marrow cells and
spleen cells.

In one aspect of the invention, the framework regions of the VHH
domains may conveniently be derived from a naive library of VHH
domains. This allows the natural variability in these sequence
segments to be reflected in the expression library. To achieve
this, the present invention has utilised information on 200
clones selected at random from a naive library of VHH. This has
allowed the identification of "anchor-regions", i.e. sequences
which are highly conserved within the naive clones and which are
thus able to provide the basis for the design of primers capable
of amplifying most if not all sequence variants of the framework
regions present in the naive library. To the extent that the
framework sequences have some variability in the naive library,
it is desirable also to retain this in the modified library
according to the invention.
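
As an illustration of how conserved "anchor-regions" might be located in sequence data, the Python sketch below scores each column of a set of pre-aligned sequences and reports runs of highly conserved columns. It is a minimal sketch under stated assumptions: the four toy sequences, the conservation threshold and the minimum run length are all hypothetical and are not taken from the 200 clones referred to above.

```python
from collections import Counter

def conservation(aligned: list[str]) -> list[float]:
    """Fraction of sequences sharing the most common base at each column."""
    n = len(aligned)
    return [Counter(col).most_common(1)[0][1] / n for col in zip(*aligned)]

def anchor_regions(aligned: list[str], threshold: float = 0.9, min_len: int = 6):
    """Return (start, end) half-open column ranges in which every column is
    conserved at or above `threshold` and which span at least `min_len`."""
    scores = conservation(aligned)
    regions, start = [], None
    for i, s in enumerate(scores + [0.0]):  # sentinel closes a trailing run
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    return regions

# Toy alignment (hypothetical sequences, not taken from the naive library).
clones = [
    "GGTCCCTGAGACTCTCCTGT",
    "GGTCCCTGAGTCTCTCCTGT",
    "GGTCCCTGAGCCTATCCTGT",
    "GGTCCCTGAGACTGTCCTGT",
]
print(anchor_regions(clones, threshold=1.0, min_len=6))  # [(0, 10), (14, 20)]
```

With more clones the threshold could be relaxed below 1.0, which is one way of reading "highly conserved" rather than strictly invariant.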

This embodiment of the invention therefore provides for the use
of a naive VHH expression library derived from a non-immunised
source for the construction of a synthetic or semi-synthetic VHH
expression library.

Although in the present invention 200 clones have been randomly
selected as a basis for framework primer design, it will be
appreciated that it would be within the capacity of the person
skilled in the art to design framework primers based on the
homology of anchor regions with framework sequences taken from a
much smaller number of clones; indeed, it is not outside the
capacity of the person skilled in the art to design framework
primers from the sequence of a single naive cDNA clone.
However, it remains the case that optimum anchor regions can
most accurately be pinpointed by analysis of the homology of a
large number of clones.

Therefore, taking sequence homology data obtained from the
randomly selected individual clones, cDNA framework primers were
designed (see Table 1):

i. to have sequences complementary to the anchor regions;

ii. to have a melting temperature preferably of at least 50°C,
to facilitate annealing.
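
The two design criteria can be checked programmatically. The Python sketch below is illustrative only: it estimates the melting temperature with the Wallace rule of thumb (2°C per A/T, 4°C per G/C), which is an assumed approximation for short oligonucleotides and not necessarily the method used to design the primers of Table 1, and the anchor sequence is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))

def wallace_tm(primer: str) -> int:
    """Wallace rule of thumb: Tm = 2*(A+T) + 4*(G+C), in degrees C."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))

def primer_ok(primer: str, anchor: str, min_tm: int = 50) -> bool:
    """Criterion (i): the primer is complementary to the anchor region;
    criterion (ii): the estimated Tm is at least `min_tm`."""
    return (reverse_complement(primer) == anchor.upper()
            and wallace_tm(primer) >= min_tm)

anchor = "GGTCCCTGAGACTCTCCTGT"   # hypothetical anchor sequence
primer = reverse_complement(anchor)
print(primer, wallace_tm(primer), primer_ok(primer, anchor))
```

For longer primers a nearest-neighbour estimate would normally be preferred; the simple rule is used here only to keep the sketch self-contained.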

A large number of framework region clones could be generated in
this way, comprising 5 different types of fragments, i.e. FR-1,
FR-2A, FR-2B, FR-3 and FR-4.

In another aspect of the invention, variability of the CDRs
derived from the naive repertoire is enhanced by mutation of at
least some of the residues they comprise. Preferably this
process introduces regions of random sequence into at least some
of the CDRs. According to a particular embodiment, the CDR
sequence in each individual clone is replaced by a synthetic
nucleic acid sequence. Conveniently, the mutagenesis of the
CDRs may be achieved by the method of overlap extension using
primers which contain at each end sequences that are
complementary or homologous to the anchor regions that form the
basis of the framework region primers listed in Table 1 and, in
between, random or partly random sequences that will ultimately
encode the CDR regions. The nucleic acid sequences of the
synthetically modified CDR primers are listed in Table 2.
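
The layout of such a CDR primer (anchor, randomised stretch, anchor) can be sketched in Python as follows. The NNK randomisation scheme (N = any base, K = G or T) is an assumption chosen because it is widely used; the text above does not state which scheme the primers of Table 2 follow, and both anchor sequences are hypothetical.

```python
import random

def cdr_primer_template(left_anchor: str, n_codons: int, right_anchor: str) -> str:
    """Anchor - randomised stretch - anchor, written with degeneracy codes.
    NNK (N = any base, K = G or T) is an assumed randomisation scheme."""
    return left_anchor + "NNK" * n_codons + right_anchor

def sample_primer(template: str, rng: random.Random) -> str:
    """Draw one concrete oligonucleotide from a degenerate template."""
    choices = {"N": "ACGT", "K": "GT"}
    return "".join(rng.choice(choices.get(b, b)) for b in template)

# Hypothetical anchors flanking four randomised codons.
template = cdr_primer_template("TCCTGTGCAGCC", 4, "TGGTTCCGCCAG")
oligo = sample_primer(template, random.Random(0))
print(template)
print(oligo)
```

Conserved residues inside a CDR could be retained by writing fixed codons into the middle of the template in place of some of the NNK triplets.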

It is important when designing the CDR primers also to take into 
account sequence homology within the CDR regions which was 
observed in the sequence data from the naive clones, as the 
amino acids concerned are thought to play a structural role in 
the VHH. It is desirable that highly conserved sequences within 
the CDRs, that is, residues that are conserved amongst a 
substantial proportion of the VHH domains in the naive 
repertoire, should be retained in the synthetically modified
primers, and excluded as targets for mutagenesis. 

Splicing by overlap extension is a modification of the
polymerase chain reaction which has been used to generate
gene fusions at very specific positions. It is based on the
ability to fuse and amplify two DNA fragments containing
homologous sequences, i.e. 'anchors', around the fusion point.

For the preparation of a 'synthetic' expression library, CDR
primers incubated with framework region fragments will anneal at
their complementary ends and fuse to generate randomised
framework-CDR encoding fragments (see Figure 1, steps 2A, B, C).
This process yields CDR-1/FR-2, CDR-2/FR-3 and CDR-3/FR-4 fusion
fragments. Two of these fragments are then fused (see Figure 1,
step 3), and so forth (steps 4 and 5).

There then follows a denaturation step after which the fragments 
can be further annealed at the 'anchor-regions' and extended 
yielding the fused, double stranded gene product. If required 
this reaction can be followed by the PCR reaction amplifying the 
quantity of fused gene material. This method can easily be 
extended to fuse three or more fragments. 
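
The outcome of one fusion step can be modelled in silico. The Python sketch below joins two fragments, both written 5' to 3' on the same strand, at their shared anchor. It models only the sequence of the fused product, not the annealing or extension chemistry, and the fragment sequences and the minimum overlap of 8 bases are hypothetical.

```python
def fuse_by_overlap(upstream: str, downstream: str, min_overlap: int = 8) -> str:
    """Join two same-strand fragments at the longest suffix/prefix overlap
    (the shared 'anchor'), mimicking the product of anneal-and-extend."""
    max_len = min(len(upstream), len(downstream))
    for k in range(max_len, min_overlap - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream + downstream[k:]
    raise ValueError("no anchor overlap of the required length")

# Hypothetical fragments sharing a 10-base anchor (GGTTCCGCCA).
fr1_cdr1 = "CAGGTGCAGCTGTCTGGATTCGGTTCCGCCA"
fr2 = "GGTTCCGCCAGGCTCCAGGGAAG"
fused = fuse_by_overlap(fr1_cdr1, fr2)
print(fused)
```

Three or more fragments can be handled by calling the function repeatedly on the growing product, provided each junction has its own distinct anchor.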

Splicing by overlap extension allows the linking of the fusion
fragments at specific positions to produce a fully assembled
HC-V gene which can be cloned into a suitable phage display
vector such as the vector pHEN.5 using restriction enzymes such
as SfiI/NotI.

It will be appreciated that other methods of introducing 
mutations, preferably including random or partially random 
sequences, into the CDRs would also be applicable. Such methods 
include, for example, cassette mutagenesis or the use of error- 
prone 'mutator' strains as bacterial hosts. 
In one aspect, the present invention provides the use of a naive
VHH expression library, not derived from an immunised source, for
the construction of a synthetic VHH expression library
characterised in that all CDRs undergo a degree of sequence
modification.

In another aspect, only one or two of the CDRs in the heavy
chain variable domain are provided with enhanced sequence
variability by the introduction of random synthetic sequences.
Thus, for the preparation of a 'semi-synthetic' library, the
FR-1/CDR-1/FR-2/CDR-2/FR-3 genes from the naive library were
assembled with CDR-3/FR-4 fusion fragments and cloned into the
phage display vector pHEN.5 as SfiI/NotI fragments.

This aspect of the invention therefore comprises the use of a
naive VHH expression library derived from a non-immunised source
for the construction of a semi-synthetic VHH expression library
characterised in that one or two CDRs undergo a degree of
sequence modification.

In a further aspect of the invention, VHH domains with
alternative combinations of three CDR sequences, which would not
have been present in the unmodified naive library, may be
generated by random recombination of fragments of VHH domain
sequences derived from a naive library. Optionally, this
recombination process may be combined with mutagenesis of one or
more of the CDRs, along the lines discussed above.

A further embodiment of the present invention resides in a
method of preparing an expression library as disclosed above,
comprising the steps of:

(i) taking sequence data obtained from a number of cDNA clones
randomly selected from a naive VHH library;

(ii) identifying a series of 'anchor regions' which show
substantially conserved homology within said sequence data
and on which basis framework primers with a capacity to
amplify framework regions of the naive library target DNA
can be constructed;

(iii) amplifying from a non-immunised source a maximal
number of different framework regions using primers from
step (ii);

(iv) combining the DNA sequences encoding each CDR present in
the naive library, optionally modified by mutation or
replacement, at least in part, by synthetic sequences,
with framework primers showing anchor region homology to
form a range of CDR primers;

(v) assembling nucleic acid sequences to create a VHH
repertoire by random recombination of the range of CDR
primers with the amplified framework regions using a
technique of splicing by overlap extension allowing
fragment fusion at the annealing anchor regions.

In another aspect, the invention provides a method for the 
preparation of antibody fragments derived from a non-immunised 
source having specificity for a target antigen comprising 
screening an expression library as set forth above for antigen 
binding activity and recovering antibody fragments having the 
desired specificity. 

The nucleic acid sequences encoding the heavy chain variable 
domains for use according to the invention may conveniently be 
cloned into an appropriate expression vector which allows fusion
with a surface protein. Suitable vectors which may be used are
well known in the art and include any DNA molecule, capable of
replication in a host organism, into which the nucleic acid
sequence can be inserted. Examples include phage vectors, for 
example lambda, T4 or filamentous bacteriophage vectors such as 
M13. Alternatively, the cloning may be performed into plasmids, 
such as plasmids coding for bacterial membrane proteins or 
eukaryotic virus vectors. 

The host cell may be prokaryotic or eukaryotic but is preferably 
bacterial, particularly E. coli. 

The expression library according to the invention may be 
screened for antigen binding activity using conventional 
techniques well known in the art as described, for example, in 
Hoogenboom, Tibtech, 1997 (15), 62-70. By way of illustration, 
bacteriophage displaying a repertoire of nucleic acid sequences 
according to the invention on the surface of the phage may be 
screened against different antigens by a 'panning' process (see
McCafferty, Nature, 348, (1990), 552-554) whereby the heavy
chain variable domains are screened for binding to immobilised 
antigen. Binding phage are retained, eluted and amplified in 
bacteria. The panning cycle is repeated until enrichment of 
phage or antigen is observed and individual phage clones are 
then assayed for binding to the panning antigen and to uncoated 
polystyrene by phage ELISA. 

As an indication of the binding affinities of antibodies that
result from the screening described in the invention,
dissociation constants for the VHHs recognising a protein
antigen will typically be less than 100nM, preferably less than
75nM, more preferably less than 50nM, still more preferably
less than 40nM, most preferably less than 25nM.
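
The Background section quotes affinities as association constants (in M^-1), whereas the thresholds above are dissociation constants. For simple 1:1 binding the two are reciprocals, so the figures can be put on a common scale. The short Python sketch below does this conversion; the function names are illustrative.

```python
def kd_from_ka(ka_per_molar: float) -> float:
    """Dissociation constant (M) as the reciprocal of the association
    constant (M^-1), for simple 1:1 binding."""
    return 1.0 / ka_per_molar

def to_nanomolar(kd_molar: float) -> float:
    """Convert a concentration in mol/l to nmol/l."""
    return kd_molar * 1e9

for ka in (1e4, 1e5, 1e6, 1e7):
    kd_nm = to_nanomolar(kd_from_ka(ka))
    print(f"Ka = {ka:.0e} M^-1  ->  Kd = {kd_nm:,.0f} nM")
```

On this scale an association constant of 10^7 M^-1 corresponds to a dissociation constant of 100 nM, the upper end of the range given above.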

The present invention therefore provides an expression library 
characterised in that superior binding affinity is achieved on 
screening than in many conventional dual chain antibody 
expression libraries or single domain libraries based on
variable domains derived from immunoglobulins which are not
naturally devoid of light chains.

The invention further provides the use of a non-immunised source 
of synthetic and semi-synthetic nucleic acid sequences encoding 
at least part of a variable domain of a heavy chain derived from 
an immunoglobulin naturally devoid of light chains to prepare an 
antibody, or fragment thereof, having binding specificity for a 
target antigen. 

By means of the invention, antibodies, particularly fragments 
thereof, having a specificity for a target antigen may 
conveniently be prepared by a method which does not require the 
donor previously to have been immunised with the target antigen. 
The method of the invention provides an advantageous alternative 
to hybridoma technology, or cloning from B cells and spleen 
cells where for each antigen, a new library is required. 

The present invention may be more fully understood with 
reference to the following description, when read together with 
the accompanying drawings.

Example 1. Construction of the naive VHH library. 

1.1 Isolation of gene fragments encoding llama HC-V domains 
A blood sample of about 200ml was taken from a non-immunised
Llama and an enriched lymphocyte population was obtained via
Ficoll (Pharmacia) discontinuous gradient centrifugation. From
these cells, total RNA was isolated by acid guanidinium
thiocyanate extraction (e.g. via the method described by
Chomczynski and Sacchi, Anal. Biochem., 162, 156-159 (1987)).
After first strand cDNA synthesis (e.g. with the Amersham first
strand cDNA kit), DNA fragments encoding VHH fragments and part
of the long or short hinge region were amplified by PCR using
specific primers:

VH-2B (PstI site):
5'-AGGTSMARCTGCAGSAGTCWGG-3' (see SEQ. ID. NO: 1)

PCR.162 (SfiI site):
5'-CATGCCATGACTCGCGGCCCAGCCGGCCATGGCCSAGGTSMARCTGCAGSAGTCWGG-3' (see SEQ. ID. NO: 2)

S = C and G, M = A and C, R = A and G, W = A and T

Lam-07 (HindIII and NotI sites):
5'-AACAGTTAAGCTTCCGCTTGCGGCCGCGGAGCTGGGGTCTTCGCTGTGGTGCG-3' (see SEQ. ID. NO: 3)

Lam-08 (HindIII and NotI sites):
5'-AACAGTTAAGCTTCCGCTTGCGGCCGCTGGTTGTGGTTTTGGTGTCTTGGGTT-3' (see SEQ. ID. NO: 4)

Upon digestion of the PCR fragments with PstI (coinciding with
codons 4 and 5 of the VHH domain, encoding the amino acids L-Q)
and NotI (located at the 3'-end of the VHH gene fragments), the
DNA fragments with a length between 300 and 400bp (encoding the
VHH domain, but lacking the first three and the last three
codons) were purified via gel electrophoresis and isolation from
the agarose gel. NotI has a recognition-site of 8 nucleotides
and it is therefore not likely that this recognition-site is
present in many of the created PCR fragments. However, PstI has
a recognition-site of only 6 nucleotides. Theoretically this
recognition-site could have been present in 10% of the created
PCR fragments, and if this sequence is conserved in a certain
class of antibody fragments, this group would not be represented
in the library cloned as PstI-NotI fragments. Therefore, a
second series of PCR was performed, in which the primary PCR
product was used as a template (10ng/reaction). In this
reaction the 5 prime VH-2B primer was replaced by PCR.162. This
primer introduces a SfiI recognition-site (8 nucleotides) at the
5 prime end of the amplified fragments for cloning. Thus, a
total of 24 different PCR products were obtained, four (short
and long hinge, PstI/NotI and SfiI/NotI) from each llama.
Upon digestion of the PCR fragments with SfiI (upstream of the
HC-V coding sequence, in the pelB leader sequence) and NotI, the
DNA fragments with a length between 300 and 400bp (encoding the
HC-V domain) were purified via gel electrophoresis and isolation
from the agarose gel.
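
As a rough check on the figures quoted above, the chance that a random fragment carries a given recognition site can be estimated as follows. This is only a sketch: it assumes equiprobable, independent bases and a 350bp fragment (the middle of the 300-400bp range), and the function name site_probability is hypothetical rather than part of the described method.

```python
def site_probability(fragment_len: int, specified_bases: int) -> float:
    """Estimate the chance that a random fragment contains a recognition site.

    Assumes each base is equally likely and positions are independent,
    which real VHH sequences only approximate.
    """
    windows = fragment_len - specified_bases + 1   # possible start positions
    p_window = 0.25 ** specified_bases             # exact match at one position
    return 1.0 - (1.0 - p_window) ** windows


p_six = site_probability(350, 6)    # 6-nucleotide site such as PstI
p_eight = site_probability(350, 8)  # 8-nucleotide site such as NotI
print(f"6-nucleotide site: {p_six:.1%}")
print(f"8-nucleotide site: {p_eight:.2%}")
```

Under these assumptions the 6-nucleotide site is expected in roughly 8% of fragments, the same order as the 10% quoted in the text, while an 8-nucleotide site is expected in well under 1%.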

1.2 Construction of HCV Library in pHEN.5

The PstI/NotI or SfiI/NotI-digested fragments were
purified from agarose and inserted into the appropriately
digested pHEN.5 vector. Prior to transformation, the ligation
reactions were purified by extraction with equal volumes of
phenol/chloroform, followed by extraction with chloroform only.
The DNA was precipitated by addition of 0.1 volume 3M NaAc pH 5.2
and 3 volumes ethanol. The DNA pellets were washed twice with 1ml
70% ethanol, dried and resuspended in 10µl sterile milliQ water.
Aliquots were transformed into electrocompetent E. coli XL1-Blue
(Stratagene) by electroporation, using a Bio-Rad Gene Pulser.
The protocol used was as recommended by Stratagene. The final
library, consisting of approximately 7.8 x 10^6 individual clones,
was harvested by scraping the colonies into 2TY + Ampicillin
(100µg/ml) + Glucose (2% w/v) culture medium (35-50ml each).
Glycerol stocks (30% v/v) and DNA stocks were prepared from
these and stored at -80°C.

Example 2 Construction of the synthetic VHH library

Building on sequence data obtained from 200 individual
clones randomly selected from the naive library, 'anchor
regions' in the framework sequences immediately flanking
the CDR regions were identified. These anchor regions were
selected based on their high degree of conserved residues
and the ability of primers based on these sequences to
amplify most if not all of the approx. 7.8 x 10^6 framework
regions present in the naive library.

These anchor regions are used to amplify the framework regions
individually, yielding 5 different types/classes of fragments
(F1/F2/F2c/F3/F4). The sequences of the primers based on these
anchor regions are listed in Table 1.

2.1 PCR Considerations 

All PCRs for amplification of the framework building blocks and
assembly of the full length VHH genes were carried out using
conditions and enzymes as described in Jesperson et al (1997).
The conditions chosen utilised a mixture of Taq and Pfu (2 units
and 1 unit, respectively, in a 100µl reaction). The proof-reading
activity of Pfu minimises the introduction of errors
and, more importantly, removes the non-templated nucleotides
added at the 3' terminus of PCR products amplified by Taq
polymerase. The presence of these nucleotides would result in
the introduction of point mutations at the junctions of each
building block in the full length VHH gene when assembled by
overlap extension reactions.

2.2 Framework PCRs (F1, F2, F2c, F3 and F4)

Framework building blocks were amplified using naive library
target DNA with the primers shown in Table 1. All framework and
assembly DNA fragments were excised from agarose gels and
purified by using the Qiaex extraction kit (Qiagen).

2.3 Assembly Reactions

CDR primers were designed on the basis of information derived
from the analysis of naive library clone sequences, wherein
largely conserved residues within the CDRs of such clones were
assumed to perform a structural role and were maintained, while
sequence variability was designed in other areas.

The CDR Primers used in the construction of the VHH synthetic 
and semi-synthetic libraries are shown in Table 2 and a 
schematic representation of the assembly process is shown in 
Figure 1. Figure 2 gives an overview of the length and 
variability allowed in the CDR regions. 
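
The NNK codon used in the CDR primers of Table 2 (N = G/A/T/C, K = G/T) can be enumerated against the standard genetic code to see what it encodes. The sketch below is illustrative only; the names CODON_TABLE and nnk_codons are hypothetical and not part of the described method.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]

stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]
amino_acids = sorted({CODON_TABLE[c] for c in nnk_codons} - {"*"})

print(len(nnk_codons), "codons")
print(len(amino_acids), "amino acids:", "".join(amino_acids))
print("stop codons:", stop_codons)
```

Restricting the third base to G or T halves the number of codons while still covering every amino acid, and leaves a single stop codon among the possible codons.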

All primers used in CDR randomisation were synthesised leaving
the trityl group on and full length primers were purified using
cartridges and protocol supplied by Perkin Elmer. For every
assembly step it was necessary to carry out 50-400 individual
100µl amplification reactions so that after Qiaex purification,
enough material was available for the next step.

Calculations of the amount of 'virgin' i.e. primary assembled 
non-amplified sequences required to represent all potential 
variability (contained within the NNKs) for the early steps in 
the assembly of the VHH gene were made. It was ensured that 
more than this amount of material was used for further 
amplification at each assembly step (See 1.7). 
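
The number of distinct DNA sequences a degenerate primer can encode, which sets the minimum amount of 'virgin' material needed, can be counted directly from the Table 2 notation. The following is a minimal sketch, assuming the bracketed-alternative notation of Tables 1 and 2; the function name primer_degeneracy is hypothetical.

```python
import re

# Number of bases represented by each ambiguity symbol used in Table 2.
AMBIGUITY = {"N": 4, "K": 2}


def primer_degeneracy(primer: str) -> int:
    """Count the distinct DNA sequences encoded by a degenerate primer.

    Bracketed groups such as (C/G) contribute one factor per alternative;
    N and K contribute 4 and 2 respectively; plain bases contribute 1.
    """
    seq = primer.replace(" ", "")
    total = 1
    for group in re.findall(r"\(([ACGT/]+)\)", seq):
        total *= len(group.split("/"))
    remainder = re.sub(r"\([ACGT/]+\)", "", seq)
    for base in remainder:
        total *= AMBIGUITY.get(base, 1)
    return total


F3 = "TGG T(A/T)C CGC CAG GCT CCA GG"
CD_2_9 = ("AAG (C/G)AG CGC GAG TT(T/G) GTC GCA NNK ATT (T/A)CT NNK GGT GGT "
          "NNK ACA NNK TAT GCA GAC TCC GTG AAG GGC CG")

print(primer_degeneracy(F3))      # one two-way position
print(primer_degeneracy(CD_2_9))  # four NNK codons plus three two-way positions
```

For the CD-2-9 primer this gives 32^4 x 2^3 sequence variants, so the 'virgin' product of that assembly step needs to contain comfortably more molecules than this figure.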

Table 3 shows the entire matrix of fragments required for both
libraries. For the synthetic library, 70% of the library was
made up of fragment H (which contains CD2-9) assembled onto all
of the CD3 primer lengths, 20% of fragment I (which contains
CD2-10) on all of the CD3 lengths and 10% of fragment J on the
longer CD3 lengths (14-24 amino acids). The length distributions
were modelled on the lengths found within the 200 randomly
selected naive VHH sequences.

2.4 Initial Assemblies (A, B, C, D, E and K5 to K24) 

The primers encoding the randomised CDR regions were assembled
with purified Framework DNA by overlap extension reactions,
using large amounts of DNA and a small number of cycles to
ensure maximum yields of 'virgin', i.e. non-amplified, product.
Assembly of CDR Primers onto purified framework DNA (Assemblies
A to E and K5 to K24) was carried out for 10 cycles with 60ng of
purified Framework DNA and 60 pmol of CDR Primer. A small
aliquot was then removed to estimate the amount of non-amplified
material, followed by a further 5 cycles with CDR primer (30
pmol; 1µl) and the relevant outside primer (45 pmol; 3µl) for
each assembly (i.e. F4 for A; F4c for B; F6 for C, D and E and
170 for K5 to K24).

2.5 Initial assemblies for the semi-synthetic library (N) 

The fragment N (see Table 3) was amplified from target DNA (2µl)
from each of 21 naive sub-libraries using primers F1 and F6 (45
pmol in a 100µl reaction).

2.6 Intermediate Assemblies (F, G, H, I and J) 

For later assemblies F (A onto F1), G (B onto F1), H (F onto C),
I (F onto D) and J (G onto E), 10 cycles of amplification were
carried out using the conditions described above and 200ng of
relevant assembly and framework/assembly (see Table 3). Then a
further 5 cycles were carried out with outside primers (3µl of
15 pmol/µl) (288 and F2 for F; 288 and F2c for G; 288 and F6 for
H, I and J).

2.7 Final Assemblies (P5-24, Q5-24, R5-24 and S5-24) 

For each assembly, 100µl reactions were set up containing 40-100ng
of K5 to K24 product plus 100-200ng of H, I, J or N
(ensuring excess of the larger fragment in molar terms). One to
four of these reactions were carried out per required assembly,
depending on the amount of final product needed. The total amount
of 'virgin' material in 100-400µl total reactions was estimated
and then 3µl of each outside primer (170, 288; 15pmol/µl) was
added to each 100µl reaction and amplification was
continued for a further 10 cycles. Yields of 'virgin' full
length assembled material (before any amplification with outside
primers) for all 71 fragments ranged from 40ng to 400ng,
depending upon amounts and efficiency. This was more than
sufficient to represent all potential variants as single
transformants, as 300ng of a 400-500 base fragment is more than
10^11 individual molecules (see Example 2.1 for the total size of
the libraries).
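
The claim that 300ng of a 400-500 base fragment exceeds 10^11 molecules can be checked with a short calculation. This sketch assumes double-stranded DNA with an average mass of about 650 g/mol per base pair, a common approximation that is not stated in the text; the function name dsdna_molecules is hypothetical.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
GRAMS_PER_MOL_PER_BP = 650.0  # assumed average mass of one base pair


def dsdna_molecules(mass_ng: float, length_bp: int) -> float:
    """Approximate number of double-stranded DNA molecules in a given mass."""
    moles = (mass_ng * 1e-9) / (length_bp * GRAMS_PER_MOL_PER_BP)
    return moles * AVOGADRO


for length in (400, 500):
    print(f"300ng of {length}bp: {dsdna_molecules(300, length):.2e} molecules")
print(f"40ng of 500bp: {dsdna_molecules(40, 500):.2e} molecules")
```

Under this assumption 300ng corresponds to several times 10^11 molecules across the 400-500bp range, consistent with the statement above, while the lowest reported yield of 40ng is closer to 10^11.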

2.8 Scale-up amplification of full length VHH for cloning

The final number of PCRs required to give the desired mixture
amount for cloning of any particular fragment was calculated
based on the natural distribution of CDR length and composition
(Table 4). The actual mixture distribution was influenced to a
certain extent by different yields obtained for different
fragments.

Large-scale amplification reactions were carried out by adding
5µl-20µl of the first PCR reactions (after 10 cycles overlap
extension and 10 cycles outside primers) to 170/288 PCRs (~25
cycles with 2 mins elongation at 72 degrees). Total final
amplifications of the 71 fragments (51 for the Synthetic library
and 20 for the Semi-Synthetic library) were pooled, precipitated
and run on 1% agarose gels; the full-length VHH gene product was
excised and purified by Qiaex extraction. Yields after
purification were approx. 1µg per 100µl PCR reaction.

2.9 Construction of the VHH Libraries in pHEN.5
The 71 gel-purified DNA fragments encoding the VHH genes for the
libraries were digested with SfiI (upstream of the VHH coding
sequence, in the pelB leader sequence) and NotI (located at the
3'-end of the VHH gene fragments). Transformation of 100µl of
electrocompetent XL1-Blue (Stratagene) E. coli with 5µg of
digested, purified pHEN.5 vector and 0.5µg of purified, digested
test fragment (close to saturation) gave approx. 8 x 10^7
transformants. 1000 x 100µl electroporations were split between
the two libraries.

The amount of purified, digested fragment was estimated for each
before insertion into SfiI/NotI digested pHEN.5 and the number
of ligation reactions for each fragment was calculated depending
upon the amount of components in the mix. Prior to
transformation, the ligation reactions were purified by
extraction with equal volumes of phenol/chloroform, followed by
extraction with chloroform only. The DNA was precipitated by
addition of 0.1 volume 3M NaAc pH 5.2 and 3 volumes ethanol.
The DNA pellets were washed twice with 1ml 70% ethanol, dried and
resuspended in 10µl sterile milliQ water. Aliquots were
transformed into electrocompetent E. coli XL1-Blue (Stratagene)
by electroporation, using a Bio-Rad Gene Pulser. The protocol
used was as recommended by Stratagene.

2.10 Large-scale phage rescue of synthetic VHH Libraries 

The final libraries consisted of 6 x 10^10 individual clones
(Synthetic library) and 4.4 x 10^10 clones (Semi-Synthetic library).
The numbers of transformants for each individual fragment were
calculated by pooling and titering of recovered transformants.
Transformed XL1-Blue were then harvested in solution by
increasing the volume of each fragment pool by 10-fold with 2TY
containing Ampicillin (100µg/ml) and Glucose (2% w/v) followed
by 3 hours of growth at 37°C with shaking. Half of this total
culture volume was pooled, grown for a further 2 hours at 37°C
with shaking, spun and stored as concentrated glycerol stocks.
The remaining half of the library was diluted 2-fold with
2TY/Ampicillin/Glucose and helper phage were added.

Super-infection was allowed to occur for 30 mins at 37°C without
shaking, followed by incubation with shaking for 30 mins. The
10 litres were spun at 5000 rpm for 20 mins and the pellets were
pooled. Each pellet was resuspended in 20ml of 2TY containing
Ampicillin (100µg/ml) and Kanamycin (50µg/ml) and pooled into
one 2L flask.

The final volume was made up to 20L with 2TY/Amp/Kan and the
cultures were grown overnight, shaking at 37°C. Phage were
harvested by two consecutive precipitations with 1/5 volume 20%
Polyethylene Glycol 8000, 2.5M NaCl and several aliquots were
used directly for panning (see below). The remaining phage
aliquots (180 x 1ml for each library) were resuspended in PBS/30%
glycerol and stored at -80°C.

Example 3 Selection of VHH fragments with binding affinity

3.1 Panning of the library using solid phase immobilised
antigens

Antigens:

Five 'antigens' were used to screen the synthetic library.
These were Lactate Oxidase (LOX), Starch Branching Enzyme II
(SBE II), Classical VHH antibody, a mix of Polyphenols, and Haem
conjugated to Bovine Serum Albumin. Panning of phages
displaying VHHs was carried out as described below.

Panning and Phage Rescue 

Aliquots of phage (1ml; ~10^13 phage particles resuspended in 2%
Marvel [plus 2% OVA or BSA]) from the large scale phage rescue
(see above) were incubated overnight with the relevant
sensitised panning tubes (Nunc Maxisorb Immunosorb). Unbound
phage were removed by washing the tube 20 times with PBS-T
followed by 20 washes with PBS. The bound phages were eluted by
adding 1mL elution buffer (0.1M HCl/glycine pH 2.2/1mg/mL BSA).
The elution mixture was neutralised with 60µL 2M Tris, and the
eluted phages were added to 9mL log-phase E. coli XL1-Blue.
Also 4mL log-phase E. coli XL1-Blue were added to the
immunotube. After incubation at 37°C for 30 minutes to allow
infection, the 10mL and 4mL infected XL1-Blue bacteria were
pooled and plated onto SOBAG plates. Following growth overnight
at 37°C the clones obtained from the antigen sensitised tubes
were harvested and used as starting material for the next round
of panning, or alternatively individual colonies were assayed
for specific antigen binding activity.

To continue panning, an aliquot (150µl) of the overnight culture
was added to 15ml of 2TY containing Ampicillin (100µg/ml) and
Glucose (2% w/v) and allowed to grow until log-phase (A600 =
0.3-0.5), at which point 4.5 x 10^9 pfu M13K07 helper phage were
added. After infection for 30 minutes at 37°C (without shaking)
the infected cells were spun down (5000 rpm for 10 minutes) and
the pellet was resuspended in 200mL 2xTY/Amp/Kan. After
incubation with shaking at 37°C overnight, the culture was spun
and the phages present in the supernatant were precipitated by
adding 1/5 volume PEG/NaCl (20% Polyethylene glycol 8000, 2.5M
NaCl). After incubation on ice-water for 1 hour the phage
particles were pelleted by centrifugation at 8000 rpm for 30
minutes. The phage pellet was resuspended in 20mL water and
re-precipitated by adding 4mL PEG/NaCl solution. After incubation
in ice-water for 15 minutes the phage particles were pelleted by
centrifugation at 5000 rpm for 15 minutes and resuspended in 2mL
PBST with 2% Marvel (plus 2% OVA or BSA). Panning results are
outlined in Table 5.

Example 4 Identification of individual HC-V fragments with
binding activity

Individual bacterial colonies were picked (48 from pans 2 and 3,
for all antigens) using sterile toothpicks and added to the
wells of 96-well microtitre plates (Sterilin) each containing
100µl of 2TY, 1% (w/v) glucose and ampicillin (100µg/ml). After
allowing the cultures to grow overnight at 37°C, 20µl aliquots
from each well of these 'masterplates' were added to the wells of
fresh microtitre plates each containing 200µl of 2TY, 1%
glucose, 100µg/ml ampicillin and 10^9 M13K07 helper phage.
Infection at 37°C for 2.5h was followed by pelleting the cells
and resuspending the infected cells in 200µl of 2TY containing
ampicillin (100µg/ml) and kanamycin (25µg/ml). Following overnight
incubation at 37°C, the phage-containing supernatants (100µl)
were added to the wells of Sterilin microtitre plates containing
100µl/well of the appropriate blocking buffer (same buffer used
as during panning reactions). Pre-blocking of the phage was
carried out in these plates for 30 mins at room temperature,
after which 100µl of phage supernatant was added to the wells
of a Greiner High Bind ELISA plate coated with the corresponding
antigen, and to the wells of an uncoated plate. After 2h
incubation at 37°C unbound phages were removed, and bound phages
were detected with rabbit anti-M13 (in house reagent) followed
by incubation with a goat anti-rabbit alkaline phosphatase
conjugate. The assays were developed with 100µl/well of
p-nitrophenyl phosphate (1mg/ml) in 1M diethanolamine, 1mM MgCl2,
pH 9.6 and the plates were read after 5-10 mins at 410nm.
Results are outlined in Table 6.

Example 5 Identification of individual VHH binders using
solution phase panning

An alternative method has been used for the screening of the
semi-synthetic library using the enzyme amylase as a target.
The antigen was biotinylated and panned using streptavidin-coated
magnetic beads. High affinity binders were then selected
by dropping the antigen concentration from 100nM to 30nM. A
panel of highly specific antibody fragments was isolated and
the Kd values determined using Pharmacia BiaCore SPR technology.
The VHHs were captured on an NTA sensor chip via the Histidine-tag
present at the C-terminus. Various concentrations (5 to
200nM) of amylase were then passed over the sensor chip and the
equilibrium constants were measured using the standard
evaluation software. Results are outlined in Table 7.
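
The effect of lowering the antigen concentration during selection can be illustrated with the simple 1:1 equilibrium binding relation, fraction bound = C / (C + Kd). This is a sketch under assumptions not stated in the text: simple 1:1 binding at equilibrium with antigen in excess over displayed VHH. The function name fraction_bound is hypothetical, and the Kd values are taken from Table 7 purely as examples.

```python
def fraction_bound(antigen_nM: float, kd_nM: float) -> float:
    """Equilibrium fraction of binder occupied by antigen (1:1 binding)."""
    return antigen_nM / (antigen_nM + kd_nM)


for kd in (6, 41):            # tightest and weakest binders in Table 7
    for antigen in (100, 30):  # antigen concentrations used in panning
        occupancy = fraction_bound(antigen, kd)
        print(f"Kd {kd}nM at {antigen}nM antigen: {occupancy:.0%} bound")
```

With these values a 6nM binder is about 83% occupied at 30nM antigen, whereas a 41nM binder is about 42% occupied, which illustrates why lowering the antigen concentration favours the tighter binders.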

Table 1. Framework primers

Code   Description   Sequence
F1     PelB leader   ATT GCC TAC GGC AGC CGC TG
F2     3' FR-1       TCC AGA GGC TGC ACA GGA GA
F3     5' FR-2       TGG T(A/T)C CGC CAG GCT CCA GG
F4     3' FR-2       GC GAC (C/A)AA CTC GCG CT(G/C) CTT
F3c    5' FR-2c      TGG TTC CGC CAG GCC CCA GG
F4c    3' FR-2c      ACA TGA GAC C(C/G)C CTC (G/A)CG CTC
F5     5' FR-3       TAT GCA GAC TCC GTG AAG GGC CG
F6     3' FR-3       ACA GTA ATA (G/A)AC GGC CGT GTC
F7     5' FR-4       TGG GGC CAG GGG ACC C(A/T)G GTC
F8     myc-tag       TTC AGA TCC TCT TCT GAG ATG AG

Table 2. CDR primers

Code      Description     Sequence
CD-1      CDR-1           T(T/C) TCC TGT GCA GCC TCT GGA AG(T/C/A)
                          A(C/T)(C/T) TT(T/C/G) AG(T/C/A) NNK NNK NNK
                          ATG GGT TGG T(A/T)C CGC CAG GCT CCA GG
CD-1a     CDR-1 (5'-3')   T(T/C) TCC TGT GCA GCC TCT GGA (G/T/A)TC
                          A(C/T)(C/T) TT(T/C/G) GAT NNK TAT NNK ATT GGT
                          TGG TTC CGC CAG GCC CCA GG
CD-2-9    CDR-2, 9 a.a.   AAG (C/G)AG CGC GAG TT(T/G) GTC GCA NNK ATT
                          (T/A)CT NNK GGT GGT NNK ACA NNK TAT GCA GAC
                          TCC GTG AAG GGC CG
CD-2-10   CDR-2, 10 a.a.  AAG (C/G)AG CGC GAG TT(T/G) GTC GCA NNK ATT
                          (T/A)CT NNK NNK GGT GGT NNK ACA NNK TAT GCA
                          GAC TCC GTG AAG GGC CG
CD-2-10c  CDR-2c          GAG CG(C/T) GAG G(G/C)G GTC TCA TGT TGT ATT
                          (T/A)CT NNK NNK GAT GGT NNK ACA NNK TAT GCA
                          GAC TCC GTG AAG GGC CG
CD-3-2    CDR-3, 5 a.a.   GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK TAC TGG GGC CAG GGG ACC CA(G/T)
                          GTC
CD-3-3    CDR-3, 6 a.a.   GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK TAC TGG GGC CAG GGG ACC
                          CA(G/T) GTC
CD-3-4    CDR-3, 7 a.a.   GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK TAC TGG GGC CAG GGG ACC
                          CA(G/T) GTC
CD-3-5    CDR-3, 8 a.a.   GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK TAC TGG GGC CAG GGG
                          ACC CA(G/T) GTC
CD-3-6    CDR-3, 9 a.a.   GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK TAC TGG GGC CAG
                          GGG ACC CA(G/T) GTC
CD-3-7    CDR-3, 10 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK TAC TGG GGC
                          CAG GGG ACC CA(G/T) GTC
CD-3-8    CDR-3, 11 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK TAC TGG
                          GGC CAG GGG ACC CA(G/T) GTC
CD-3-9    CDR-3, 12 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK TAC
                          TGG GGC CAG GGG ACC CA(G/T) GTC
CD-3-10   CDR-3, 13 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          TAC TGG GGC CAG GGG ACC CA(G/T) GTC
CD-3-11   CDR-3, 14 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK TAC TGG GGC CAG GGG ACC CA(G/T) GTC
CD-3-12   CDR-3, 15 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK NNK TAC TGG GGC CAG GGG ACC CA(G/T) GTC
CD-3-13   CDR-3, 16 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK NNK NNK TAC TGG GGC CAG GGG ACC CA(G/T)
                          GTC
CD-3-14   CDR-3, 17 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK NNK NNK NNK TAC TGG GGC CAG GGG ACC
                          CA(G/T) GTC
CD-3-15   CDR-3, 18 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK NNK NNK NNK NNK TAC TGG GGC CAG GGG ACC
                          CA(G/T) GTC
CD-3-16   CDR-3, 19 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK NNK NNK NNK NNK NNK TAC TGG GGC CAG GGG
                          ACC CA(G/T) GTC
CD-3-17   CDR-3, 20 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK NNK NNK NNK NNK NNK NNK TAC TGG GGC CAG
                          GGG ACC CA(G/T) GTC
CD-3-18   CDR-3, 21 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK NNK NNK NNK NNK NNK NNK NNK TAC TGG GGC
                          CAG GGG ACC CA(G/T) GTC
CD-3-19   CDR-3, 22 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK NNK NNK NNK NNK NNK NNK NNK NNK TAC TGG
                          GGC CAG GGG ACC CA(G/T) GTC
CD-3-20   CDR-3, 23 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK TAC
                          TGG GGC CAG GGG ACC CA(G/T) GTC
CD-3-21   CDR-3, 24 a.a.  GAC ACG GCC GT(C/T) TAT TAC TGT (G/A)(C/A)T
                          GCC NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK NNK
                          TAC TGG GGC CAG GGG ACC CA(G/T) GTC

N = G/A/T/C, K = G/T

Table 3. Steps in the Assembly of the Synthetic VHH Library
(also see Figures 1 and 2)



Assembly Name   Step Required

Synthetic Library
A               CD1 onto F2
B               CD1c onto F2c
C               CD2-9 onto F3
D               CD2-10 onto F3
E               CD2-10c onto F3
F               CD1-F2 (A) onto F1
G               CD1c-F2c (B) onto F1
H               F1-CD1-F2 (F) onto CD2-9-F3 (C)
I               F1-CD1-F2 (F) onto CD2-10-F3 (D)
J               F1-CD1c-F2c (G) onto CD2-10c-F3 (E)
K5 to K24       CD3-5, CD3-6, CD3-7 up to CD3-24 onto F4
P5 to P24       H onto K5 to K24
Q5 to Q24       I onto K5 to K24
R5 to R24       J onto K5 to K24

Semi-Synthetic Library
N               F1-CD1-F2-CD2-F3 amplified from naive library
                DNA using F1 and F6
S5 to S24       N onto K5 to K24

Table 4. Number of Scale-up PCRs for Final Assemblies



CD3 length         5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24
H (70%)            5   5  10  30  50  55  60  70  80  85  90  85  70  55  40  20  10  10   5   5
I (20%)            5   5   5  10  10  15  20  20  25  30  25  20  15  10  10   5   5   5   5   5
J (10%)            -   -   -   -   -   -   -   -   -  10  10  10  10  10  10  10  10  10  10  10
Semi-synthetic    10  10  20  50  60  70  80  90 100 120 130 110  90  80  60  40  20  20  10  10

Table 5. Synthetic (S) and Semi-Synthetic (SS) Library
Panning Results



Panning Antigen     Pan 1   Pan 2       Pan 3        Pan 4    Pan 5
LOX (SS)            ND      2-fold      12-fold
SBE II (S)          ND      NONE        NONE         8-fold   8-fold
HCV-Classic (SS)    ND      1200-fold   10000-fold
Polyphenols (SS)    ND      52-fold     32-fold
Haem (SS)           ND      500-fold    1000-fold

Table 6. Percentage of VHH phage clones which specifically 
recognise immobilised antigen 



Panning Antigens     Pan 2   Pan 3   Pan 4   Pan 5
LOX (SS)             2%      38%
SBE II (S)           0%      4%      48%     60%
HCV-Classical (SS)   31%     54%
Polyphenols (SS)     23%     9%
Haem (SS)            77%     79%

Table 7. Equilibrium Constants for VHHs recognising Amylase



VHH Number   Equilibrium Constants (nM)
11G          6
12A          11
3A           16
7E           22
3F           34
5F           39
9C           41

CLAIMS



1. An expression library comprising a repertoire of nucleic
acid sequences, which sequences are not cloned from an
immunised source, each nucleic acid sequence encoding at
least part of a variable domain of a heavy chain derived
from an immunoglobulin naturally devoid of light chains
(VHH) wherein the extent of sequence variability in said
library is enhanced compared to the corresponding naive
expression library repertoire by the introduction of
mutations in one or more of the complementarity determining
regions (CDRs) of said nucleic acid sequences or by
generating alternative combinations of CDR and framework
sequences not naturally present in the naive library
repertoire.

2. A library according to claim 1, wherein the nucleic acid
sequences are modified by random or partially random
mutation of one or more of the CDR sequences.

3. A library according to claim 1 or claim 2, wherein at least
part of one or more of the CDR sequences is replaced by
synthetic nucleic acid sequences.

4. A library according to any one of claims 1 to 3, wherein at
least part of each of the CDR sequences is replaced by
synthetic nucleic acid sequences.

5. A library according to claim 1, wherein VHH sequences
comprising alternative CDR and framework sequence
combinations are generated by random recombination of
fragments of VHH sequences derived from a naive library or
a library according to any one of claims 2 to 4.

6. A library according to any one of claims 1 to 5, wherein
conserved residues within the CDR sequences are retained in
the modified nucleic acid sequences.

7. A library according to any one of claims 1 to 6 wherein
sequence variability present in the naive library
repertoire in those parts of the VHH domain other than the
CDRs is conserved.

8. An expression library according to any one of claims 1 to
7, wherein the variable domain of a heavy chain derived
from an immunoglobulin naturally devoid of light chains is
derived from a camelid immunoglobulin.

9. Use of a naive VHH expression library to prepare an
expression library according to any one of claims 1 to 8.

10. A method for preparing a library according to any one of
claims 1 to 8 comprising the steps:

(i) taking sequence data obtained from a number of cDNA
clones randomly selected from a naive VHH library;

(ii) identifying a series of anchor regions which show
substantial conserved homology within said sequence
data and on which basis framework primers with a
capacity to amplify framework regions of the naive
library target DNA can be constructed;

(iii) amplification from a non-immunised source of a
maximal number of different framework regions using
primers from step (ii);

(iv) combining the DNA sequences encoding each CDR present
in the naive library, optionally modified by mutation
or replacement, at least in part, by synthetic
sequences, with framework primers to form a range of
CDR primers;

(v) assembling nucleic acid sequences to create a VHH
repertoire by random recombination of the range of
CDR primers with the amplified framework regions
using a technique of splicing by overlap extension.

11. Method for the preparation of antibody fragments derived
from a non-immunised source having specificity for a target
antigen comprising screening an expression library
according to any of claims 1 to 9 for antigen binding
activity and recovering antibody fragments having the
desired specificity.

Step 5C
Fig. 2.